Terminology Extraction and Term Ranking for Standardizing Term Banks

Authors

  • Magnus Merkel
  • Jody Foo
Abstract

This paper presents how word alignment techniques could be used to build standardized term banks. It shows that time and effort could be saved with a relatively simple evaluation metric based on frequency data from term pairs and on the source and target distributions in the alignment results. The proposed Q-value metric is shown to outperform the other tested metrics, such as Dice’s coefficient and simple pair frequency.
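As a rough illustration of the kind of ranking the abstract describes, the sketch below scores aligned term pairs with simple pair frequency and Dice's coefficient (the standard 2·f(s,t) / (f(s) + f(t)) form). The abstract does not give the Q-value formula, so the `q_like` score here is only an assumption: pair frequency divided by the number of distinct targets seen for the source plus the number of distinct sources seen for the target. The function name `rank_term_pairs` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def rank_term_pairs(aligned_pairs):
    """Score (source, target) term pairs taken from word alignment output.

    aligned_pairs: list of (source_term, target_term) tuples, one per aligned token pair.
    Returns a dict mapping each distinct pair to its scores.
    """
    pair_freq = Counter(aligned_pairs)
    src_freq = Counter(s for s, _ in aligned_pairs)
    tgt_freq = Counter(t for _, t in aligned_pairs)

    # Distribution data: which distinct targets each source aligns to, and vice versa.
    targets_per_source = defaultdict(set)
    sources_per_target = defaultdict(set)
    for s, t in pair_freq:
        targets_per_source[s].add(t)
        sources_per_target[t].add(s)

    scores = {}
    for (s, t), f in pair_freq.items():
        dice = 2 * f / (src_freq[s] + tgt_freq[t])
        # ASSUMPTION: illustrative stand-in, not the paper's stated Q-value definition.
        spread = len(targets_per_source[s]) + len(sources_per_target[t])
        scores[(s, t)] = {"pair_freq": f, "dice": dice, "q_like": f / spread}
    return scores


# Hypothetical alignment output for a small English-Swedish sample.
pairs = [("valve", "ventil")] * 3 + [("valve", "klaff")] + [("pump", "pump")] * 2
scores = rank_term_pairs(pairs)
for pair, s in sorted(scores.items(), key=lambda kv: kv[1]["q_like"], reverse=True):
    print(pair, s)
```

A score of this shape rewards pairs that are frequent and penalizes terms that align to many different counterparts, which is one plausible reading of "source and target distributions inside the alignment results".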

Similar resources

TermSuite: Terminology Extraction with Term Variant Detection

We introduce TermSuite, a Java and UIMA-based toolkit for building terminologies from corpora. TermSuite follows the classic two steps of terminology extraction tools, the identification of term candidates and their ranking, but implements new features. It is designed to be multilingual, is scalable, and handles term variants. We focus on the main components: UIMA Tokens Regex for defining term and varia...

A Multi-Word Term Extraction Program for Arabic Language

Terminology extraction commonly includes two steps: identification of term-like units in the texts, mostly multi-word phrases, and ranking of the extracted term-like units according to their domain representativity. In this paper, we design a multi-word term extraction program for the Arabic language. The linguistic filtering performs a morphosyntactic analysis and takes into account several ty...

Some Considerations on Guidelines for Bilingual Alignment and Terminology Extraction

Despite progress in the development of computational means, human input is still critical in the production of consistent and useable aligned corpora and term banks. This is especially true for specialized corpora and term banks whose end-users are often professionals with very stringent requirements for accuracy, consistency and coverage. In the compilation of a high quality Chinese-English le...

Term formation as the object of analysis of various terminology systems (on the basis of analysis of aerospace terminology in Russian language)

This article studies a method for analyzing various term systems from the perspective of term formation. Aerospace terminology in Russian serves as the sample for the analysis. The main ways of term formation are divided into four groups: the synthetic way, adoption, semantic metaphorization, and the analytic way. Each way and the nuances of its analysis are explained in detai...

Knowledge-poor and Knowledge-rich Approaches for Multilingual Terminology Extraction

In this paper, we present two terminology extraction tools in order to compare a knowledge-poor and a knowledge-rich approach. Both tools process single and multi-word terms and are designed to handle multilingualism. We run an evaluation on six languages and two different domains using crawled comparable corpora and hand-crafted reference term lists. We discuss the three main results achieved f...

Publication date: 2007